IN THE CLAIMS

Please amend or retain the claims as follows: 

1. (Currently Amended) A method of setting up a DLSI space-based classifier to be 
stored in a computer storage device for document classification using a computer, and classifying 
using said classifier by said computer to classify a document according to a plurality of clusters 
within a database, using said classifier comprising the steps of:

preprocessing documents using said computer to distinguish terms of a word and a noun 
phrase from stop words; 

constructing system terms by setting up a term list as well as global weights using said
computer;

normalizing document vectors of collected documents, as well as centroid vectors of each
cluster using said computer;

constructing a differential term by intra-document matrix $D_I^{m \times n_I}$ using said computer,
such that each column in said matrix is a differential intra-document vector;

decomposing the differential term by intra-document matrix $D_I$, by an SVD algorithm
using said computer, into $D_I = U_I S_I V_I^T$ ($S_I = \mathrm{diag}(\delta_{I,1}, \delta_{I,2}, \ldots)$), followed by a composition of
$D_{I,k_I} = U_{k_I} S_{k_I} V_{k_I}^T$ giving an approximate $D_I$ in terms of an appropriate $k_I$;

setting up a likelihood function of intra-differential document vector using said computer;

constructing a term by extra-document matrix $D_E^{m \times n_E}$ using said computer, such that each
column of said extra-document matrix is an extra-differential document vector;

decomposing $D_E$, by exploiting the SVD algorithm using said computer, into
$D_E = U_E S_E V_E^T$ ($S_E = \mathrm{diag}(\delta_{E,1}, \delta_{E,2}, \ldots)$), then with a proper $k_E$, defining $D_{E,k_E} = U_{k_E} S_{k_E} V_{k_E}^T$ to
approximate $D_E$;

setting up a likelihood function of extra-differential document vector using said
computer;

setting up a posteriori function using said computer; and

classifying, on the basis of the said computer using said DLSI space-based classifier, as
set up in the foregoing steps, to classify the document as belonging to one of the plurality of 
clusters within the database. 

2. (Currently Amended) An automatic document classification method using a 
DLSI space-based classifier operating on a computer as a computerized classifier to classify a 
document in accordance with clusters in a database, comprising the steps of:

a) setting up, by said computerized classifier, a document vector by generating terms as
well as frequencies of occurrence of said terms in the document, so that a normalized document 
vector N is obtained for the document; 

b) constructing, by said computerized classifier, using the document to be classified, a 
differential document vector x = N - C, where C is the normalized vector giving a center or
centroid of a cluster; 

c) calculating an intra-document likelihood function $P(x \mid D_I)$ for the document using
said computerized classifier;

d) calculating an extra-document likelihood function $P(x \mid D_E)$ for the document using
said computerized classifier;

e) calculating a Bayesian posteriori probability function using said computerized
classifier $P(D_I \mid x)$;

f) repeating, for each of the clusters of the database, steps b-e;

g) selecting, by said computerized classifier, a cluster having a largest $P(D_I \mid x)$ as the
cluster to which the document most likely belongs; and

h) classifying, by said computerized classifier, the document in the selected cluster
within said database thereby categorizing said document for an automated document retrieval
system.

3. (Original) The method as set forth in claim 2, wherein the normalized document
vector N is obtained using an equation, $b_{ij} = a_{ij} / \sqrt{\sum a_{ij}^2}$.
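
For illustration only, steps a) through h) of claim 2 can be sketched in Python as follows. This is a minimal sketch, not the claimed implementation: it assumes the normalization of claim 3 scales a term-frequency vector to unit Euclidean length, and the names `normalize`, `classify`, and the `posterior` callable (which stands in for steps c) to e)) are hypothetical.

```python
import math

def normalize(vec):
    """Scale a term-frequency vector to unit Euclidean length (assumed form)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def classify(doc_vec, centroids, posterior):
    """Return the index of the cluster with the largest posterior value.

    doc_vec   : raw term-frequency vector of the document
    centroids : one normalized centroid vector C per cluster
    posterior : hypothetical callable mapping a differential vector x
                to P(D_I | x); it stands in for steps c) to e)
    """
    n_vec = normalize(doc_vec)                          # step a)
    best_index, best_value = None, float("-inf")
    for index, c in enumerate(centroids):               # step f)
        x = [ni - ci for ni, ci in zip(n_vec, c)]       # step b): x = N - C
        value = posterior(x)                            # steps c) to e)
        if value > best_value:                          # step g)
            best_index, best_value = index, value
    return best_index                                   # step h)
```

Any function that decreases as x moves away from the cluster can be passed as `posterior` to exercise the loop; the likelihood-based posterior recited in the claims is one such function.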

4. (Currently Amended) A method of setting up a DLSI space-based classifier to be 
stored in a computer storage device for document classification using a computer, and classifying 
using said classifier by said computer to classify a document according to a plurality of clusters 
within a database, using said classifier comprising the steps of:

setting up, using said computer, a differential term by intra-document matrix where each
column of the matrix denotes a difference between a document and a centroid of a cluster to 
which the document belongs; 

decomposing, using said computer, the differential term by intra-document matrix by an
SVD algorithm to identify an intra-DLSI space; 

setting up, using said computer, a probability function for a differential document vector
being a differential intra-document vector; 

calculating, using said computer, the probability function according to projection and
distance from the differential document vector to the intra-DLSI space;

setting up, using said computer, a differential term by extra-document matrix where each
column of the matrix denotes a differential document vector between a document vector and a 
centroid vector of a cluster which does not include the document; 

decomposing, using said computer, the differential term by extra-document matrix by an
SVD algorithm to identify an extra-DLSI space; 

setting up, using said computer, a probability function for a differential document vector
being a differential extra-document vector; 

setting up, using said computer, a posteriori likelihood function using the differential
intra-document and differential extra-document vectors to provide a most probable similarity 
measure of a document belonging to a given cluster; and 

classifying, said computer using said DLSI space-based classifier, as set up in the 
foregoing steps, to classify the document as belonging to one of the plurality of clusters within 
the database. 
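
For illustration only, the columns of the two differential matrices recited in claim 4 can be sketched in Python as follows. This is a sketch under stated assumptions: vectors are normalized to unit Euclidean length, a centroid is taken as the normalized mean of its cluster, and every document is paired with every cluster that does not contain it. The function names are hypothetical, and the SVD step is left to a numerical library.

```python
import math

def unit(vec):
    """Scale a vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def centroid(docs):
    """Normalized mean of a list of equally long document vectors."""
    size = len(docs[0])
    mean = [sum(d[i] for d in docs) / len(docs) for i in range(size)]
    return unit(mean)

def differential_columns(clusters):
    """Build the columns of the intra- and extra-document matrices.

    clusters : list of clusters, each a list of raw document vectors
    Returns (intra, extra), two lists of differential vectors.
    """
    normed = [[unit(d) for d in docs] for docs in clusters]
    centroids = [centroid(docs) for docs in normed]
    intra, extra = [], []
    for own, docs in enumerate(normed):
        for d in docs:
            for other, c in enumerate(centroids):
                diff = [a - b for a, b in zip(d, c)]
                # Own centroid gives an intra column, any other an extra column.
                (intra if own == other else extra).append(diff)
    return intra, extra
```

Each returned list holds the columns of one matrix; transposing it into a term by document array gives the input to the SVD that identifies the corresponding DLSI space.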



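5. (Original) The method as set forth in claim 4, wherein the step of setting up a
probability function for a differential document vector being a differential intra-document
vector is performed using an equation,

$$P(x \mid D_I) = \frac{n_I^{1/2} \exp\left(-\frac{n_I}{2} \sum_{i=1}^{k_I} \frac{y_i^2}{\delta_{I,i}^2}\right) \cdot \exp\left(-\frac{n_I \varepsilon^2(x)}{2\rho_I}\right)}{(2\pi)^{n_I/2} \prod_{i=1}^{k_I} \delta_{I,i} \cdot \rho_I^{(r_I - k_I)/2}},$$

where $y = U_{k_I}^T x$, $\varepsilon^2(x) = \|x\|^2 - \sum_{i=1}^{k_I} y_i^2$, $\rho_I = \frac{1}{r_I - k_I} \sum_{i=k_I+1}^{r_I} \delta_{I,i}^2$, and $r_I$ is the rank of matrix $D_I$.

6. (Original) The method as set forth in claim 4, wherein the step of setting up a
probability function for a differential document vector being a differential extra-document vector
is performed using an equation,

$$P(x \mid D_E) = \frac{n_E^{1/2} \exp\left(-\frac{n_E}{2} \sum_{i=1}^{k_E} \frac{y_i^2}{\delta_{E,i}^2}\right) \cdot \exp\left(-\frac{n_E \varepsilon^2(x)}{2\rho_E}\right)}{(2\pi)^{n_E/2} \prod_{i=1}^{k_E} \delta_{E,i} \cdot \rho_E^{(r_E - k_E)/2}},$$

where $y = U_{k_E}^T x$, $\varepsilon^2(x) = \|x\|^2 - \sum_{i=1}^{k_E} y_i^2$, $\rho_E = \frac{1}{r_E - k_E} \sum_{i=k_E+1}^{r_E} \delta_{E,i}^2$, $r_E$ is the rank of matrix $D_E$.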
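
For illustration only, the likelihood of claims 5 and 6 can be evaluated as sketched below. The same expression serves both the intra and the extra case, depending on which matrix supplies the singular vectors and values. Working with the logarithm is an assumption made here to avoid underflow for large n, and the function and parameter names are hypothetical.

```python
import math

def log_likelihood(x, u_k, deltas, n):
    """Logarithm of P(x | D) in the form recited in claims 5 and 6.

    x      : differential document vector (list of floats)
    u_k    : the k leading left singular vectors, each as long as x
    deltas : all r nonzero singular values, largest first (r > k assumed)
    n      : number of columns of the differential matrix D
    """
    k, r = len(u_k), len(deltas)
    # Projection y = U_k^T x onto the k-dimensional DLSI space.
    y = [sum(ui * xi for ui, xi in zip(u, x)) for u in u_k]
    # Squared distance from x to that space: eps^2 = ||x||^2 - sum(y_i^2).
    eps2 = sum(xi * xi for xi in x) - sum(yi * yi for yi in y)
    # rho averages the squared singular values that were discarded.
    rho = sum(d * d for d in deltas[k:]) / (r - k)
    in_space = sum(yi * yi / (d * d) for yi, d in zip(y, deltas))
    return (0.5 * math.log(n)
            - 0.5 * n * in_space
            - n * eps2 / (2.0 * rho)
            - 0.5 * n * math.log(2.0 * math.pi)
            - sum(math.log(d) for d in deltas[:k])
            - 0.5 * (r - k) * math.log(rho))
```

Applying `math.exp` to the result recovers the likelihood itself when the value is within floating-point range.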

7. (Original) The method as set forth in claim 4, wherein the step of setting up a 
posteriori likelihood function is performed using an equation, 

$$P(D_I \mid x) = \frac{P(x \mid D_I) P(D_I)}{P(x \mid D_I) P(D_I) + P(x \mid D_E) P(D_E)},$$

where $P(D_I)$ is set to $1/n_c$ where $n_c$ is the number of clusters in the database and
$P(D_E)$ is set to $1 - P(D_I)$.
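
For illustration only, the posteriori function of claim 7 reduces to a few lines of Python once the two likelihoods are known. The function name and the zero-denominator guard are assumptions of this sketch.

```python
def posterior_intra(p_x_given_intra, p_x_given_extra, n_clusters):
    """Bayes rule for P(D_I | x) with the priors recited in claim 7."""
    prior_intra = 1.0 / n_clusters        # P(D_I) = 1 / n_c
    prior_extra = 1.0 - prior_intra       # P(D_E) = 1 - P(D_I)
    numerator = p_x_given_intra * prior_intra
    denominator = numerator + p_x_given_extra * prior_extra
    # Guard against both likelihoods underflowing to zero.
    return numerator / denominator if denominator > 0 else 0.0
```

With four clusters, an intra likelihood of 0.2 and an extra likelihood of 0.05, the numerator is 0.05 and the denominator is 0.0875, so the posterior is 4/7.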

8. (Currently Amended) The method as set forth in claim 4, the step of classifying 
the document said computer using the DLSI space-based classifier to classify the document
comprising the steps of: 

a) setting up a document vector by generating terms as well as frequencies of 
occurrence of said terms in the document, so that a normalized document vector N is obtained 
for the document; 

b) constructing, using the document to be classified, a differential document vector 
x = N - C, where C is the normalized vector giving a center or centroid of a cluster;

c) calculating an intra-document likelihood function $P(x \mid D_I)$ for the document;

d) calculating an extra-document likelihood function $P(x \mid D_E)$ for the document;

e) calculating a Bayesian posteriori probability function $P(D_I \mid x)$;

f) repeating, for each of the clusters of the database, steps b-e;

g) selecting a cluster having a largest $P(D_I \mid x)$ as the cluster to which the document
most likely belongs; and 

h) classifying the document in the selected cluster. 



9. (Original) The method as set forth in claim 8, wherein the normalized document
vector N is obtained using an equation, $b_{ij} = a_{ij} /$